<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>R: Yearly batting records for all major league baseball players</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" type="text/css" href="R.css">
</head><body>

<table width="100%" summary="page for baseball"><tr><td>baseball</td><td align="right">R Documentation</td></tr></table>

<h2>Yearly batting records for all major league baseball players</h2>

<h3>Description</h3>


<p>This data frame contains batting statistics for a subset
of players collected from
<a href="http://www.baseball-databank.org/">http://www.baseball-databank.org/</a>. There are a
total of 21,699 records, covering 1,228 players from 1871
to 2007. Only players with more than 15 seasons of play are
included.
</p>


<h3>Format</h3>

<p>A 21699 x 22 data frame</p>


<h3>Variables</h3>


<p>Variables: </p>
 <ul>
<li><p> id, unique player id </p>
</li>
<li>
<p>year, year of data </p>
</li>
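<li><p> stint, order of the player's appearances within a season (a new stint starts when the player changes teams) </p>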
</li>
<li><p> team, team played
for </p>
</li>
<li><p> lg, league </p>
</li>
<li><p> g, number of games </p>
</li>
<li><p> ab,
number of times at bat </p>
</li>
<li><p> r, number of runs </p>
</li>
<li><p> h,
hits, times reached base because of a batted, fair ball
without error by the defense </p>
</li>
<li><p> X2b, doubles: hits on which the
batter reached second base safely </p>
</li>
<li><p> X3b, triples: hits on
which the batter reached third base safely </p>
</li>
<li><p> hr,
number of home runs </p>
</li>
<li><p> rbi, runs batted in </p>
</li>
<li><p> sb,
stolen bases </p>
</li>
<li><p> cs, caught stealing </p>
</li>
<li><p> bb, bases on
balls (walks) </p>
</li>
<li><p> so, strikeouts </p>
</li>
<li><p> ibb, intentional
base on balls </p>
</li>
<li><p> hbp, times hit by pitch </p>
</li>
<li><p> sh,
sacrifice hits </p>
</li>
<li><p> sf, sacrifice flies </p>
</li>
<li><p> gidp,
times grounded into double plays </p>
</li></ul>
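
<p>As a rough illustration of how these columns combine, the sketch below
collapses multiple stints into one row per player and year, then derives a
batting average and a slugging percentage. The <code>toy</code> data frame and
its values are made up for this example and are not rows from
<code>baseball</code>; the sketch also assumes that <code>h</code> counts every
hit, including doubles, triples and home runs.</p>

<pre>
# Hypothetical rows using the column names documented above.
toy &lt;- data.frame(
  id    = c("playera01", "playera01", "playerb01"),
  year  = c(1990, 1990, 1990),
  stint = c(1, 2, 1),
  ab    = c(200, 100, 400),
  h     = c(50, 30, 120),
  X2b   = c(10, 5, 20),
  X3b   = c(2, 1, 4),
  hr    = c(5, 4, 16),
  stringsAsFactors = FALSE
)

# Sum the counting statistics over stints: one row per player and year.
season &lt;- aggregate(cbind(ab, h, X2b, X3b, hr) ~ id + year,
                    data = toy, FUN = sum)

# Batting average: hits per at bat.
season$avg &lt;- season$h / season$ab

# Total bases: every hit is worth one base, then add one extra base per
# double, two per triple and three per home run.
season$tb &lt;- season$h + season$X2b + 2 * season$X3b + 3 * season$hr

# Slugging percentage: total bases per at bat.
season$slg &lt;- season$tb / season$ab
</pre>

<p>Summing before dividing matters: averaging the per-stint ratios would
weight a short stint as heavily as a long one.</p>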



<h3>References</h3>


<p><a href="http://www.baseball-databank.org/">http://www.baseball-databank.org/</a>
</p>


<h3>Examples</h3>

<pre>
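library(plyr)  # provides mutate(), ddply() and dlply(), plus the baseball data

# Babe Ruth's records, with career year counted from 1
baberuth &lt;- subset(baseball, id == "ruthba01")
baberuth$cyear &lt;- baberuth$year - min(baberuth$year) + 1

# Add career year (counted from 0 here) and the fraction of the career elapsed
calculate_cyear &lt;- function(df) {
  mutate(df,
    cyear = year - min(year),
    cpercent = cyear / (max(year) - min(year))
  )
}

# Apply to each player's records, then drop rows with fewer than 25 at bats
baseball &lt;- ddply(baseball, .(id), calculate_cyear)
baseball &lt;- subset(baseball, ab &gt;= 25)

# Linear trend in runs batted in per at bat over the career
model &lt;- function(df) {
  lm(rbi / ab ~ cyear, data=df)
}
model(baberuth)
# One fitted model per player, returned as a list
models &lt;- dlply(baseball, .(id), model)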
</pre>
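
<p>The split, fit and combine pattern used above can also be sketched in base
R on a small made-up data frame, which may help when checking the logic
without loading the full data. This swaps <code>dlply</code> for
<code>split</code> and <code>lapply</code>; the names <code>toy</code>,
<code>toy_models</code> and <code>slopes</code> are hypothetical, and the
values are invented for illustration.</p>

<pre>
# Made-up data with the columns the model needs; not real player records.
toy &lt;- data.frame(
  id    = rep(c("playera01", "playerb01"), each = 4),
  cyear = rep(0:3, times = 2),
  ab    = rep(100, 8),
  rbi   = c(10, 12, 14, 16, 20, 18, 16, 14),
  stringsAsFactors = FALSE
)

# Split the rows by player and fit one linear model per piece.
pieces &lt;- split(toy, toy$id)
toy_models &lt;- lapply(pieces, function(df) lm(rbi / ab ~ cyear, data = df))

# Combine: extract the cyear slope from each model into a named vector.
slopes &lt;- sapply(toy_models, function(m) coef(m)[["cyear"]])
</pre>

<p>A positive slope indicates that runs batted in per at bat rose over the
player's career in this toy data, and a negative slope that it fell.</p>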


</body></html>
